• FetchFox is a Chrome extension for extracting any data from any website. It takes the raw text of a website and uses AI to extract data the user is looking for.

    Thursday, September 5, 2024
  • Pipet is a command-line tool for scraping and extracting data from online sources, aimed at developers and hackers. It operates in three modes: HTML parsing, JSON parsing, and client-side JavaScript evaluation. It builds on existing tools such as curl and composes with other programs through Unix pipes, which makes it easy to automate data retrieval. Practical uses include tracking shipments, monitoring ticket availability, and watching stock prices. Users write Pipet files that define how to scrape specific data from a website; a short file is enough to fetch the latest news from Hacker News, for instance. Command-line flags customize the output: users can set a custom separator for text output, emit JSON, or render results through a template. Pipet can also monitor a webpage for changes and trigger a notification when a specified condition is met. It can be installed from pre-built binaries, with Go, or from packages for Arch Linux and Homebrew. Running it requires only the path to a .pipet file, plus any optional flags. A Pipet file consists of resource lines that define the URL and scraping method, query lines that specify the data to extract, and optional next-page lines for pagination. HTML queries use CSS selectors, JSON queries use GJSON syntax, and Playwright queries execute JavaScript in a headless browser, so the same tool can extract data from simple HTML pages and from complex web applications alike. Overall, Pipet combines ease of use with advanced features for developers who want to automate their data scraping tasks.
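
The three-part file structure described above (resource lines, query lines, optional next-page lines) can be made concrete with a small sketch. This is not Pipet's own parser; it is a hypothetical Python illustration that assumes blocks are separated by blank lines, that the first line of each block is the resource line, and that a next-page line starts with `>`. The sample file contents, including the selectors and the `example.com` URL, are illustrative and should be checked against Pipet's documentation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Block:
    """One scraping block: a resource line, its queries, and optional pagination."""
    resource: str                                     # e.g. "curl https://example.com"
    queries: List[str] = field(default_factory=list)  # indentation is preserved
    next_page: Optional[str] = None                   # selector for the "next page" link


def parse_pipet_like(text: str) -> List[Block]:
    """Split a Pipet-style file into blocks (assumed layout, see lead-in)."""
    blocks: List[Block] = []
    current: Optional[Block] = None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line.strip():
            current = None            # a blank line ends the current block
            continue
        if current is None:
            # Assumption: the first non-blank line of a block is the resource line.
            current = Block(resource=line.strip())
            blocks.append(current)
        elif line.startswith(">"):
            # Assumption: an unindented leading ">" marks the next-page line.
            current.next_page = line[1:].strip()
        else:
            current.queries.append(line)
    return blocks


# Illustrative file: an HTML block with pagination, then a JSON block.
SAMPLE = """curl https://news.ycombinator.com/
.title .titleline
  span > a
> a.morelink

curl https://example.com/data.json
items.#.name
"""

if __name__ == "__main__":
    for block in parse_pipet_like(SAMPLE):
        print(block)
```

Keeping the indentation on query lines is deliberate: in this sketch it is left for a later stage to interpret (for example, as nesting), rather than being discarded during parsing.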